Abstract

Systems with a sparse impulse response, whose sparsity level varies over time, are encountered frequently. Conventional
adaptive filters, which perform well for identification of non-sparse systems, fail to exploit the system sparsity to improve
performance as the sparsity level increases. This paper presents a new approach that uses an adaptive convex combination of
the Affine Projection Algorithm (APA) and the Zero-attracting Affine Projection Algorithm (ZA-APA) for identifying sparse
systems, which adapts dynamically to the sparsity of the system. The combination therefore works well in both sparse and
non-sparse environments, and the use of affine projection makes it robust against colored input. It is shown that, for
non-sparse systems, the proposed combination always converges to the APA algorithm, while for semi-sparse systems it converges
to a solution that produces a lower steady-state EMSE than either of the component filters. For highly sparse systems,
depending on the value of the proportionality constant ($\rho$) in the ZA-APA algorithm, the proposed combined filter may
either converge to the ZA-APA based filter or produce a solution similar to the semi-sparse case, i.e., one that outperforms
both constituent filters.

Index Terms: Sparse Systems, $\ell_1$ Norm, Compressive Sensing, Excess Mean Square Error.

I. Introduction 

Many real-life systems exhibit a sparse representation, i.e., their impulse response is characterized by a small
number of non-zero taps in the presence of a large number of inactive taps. Sparse systems are encountered in many important
practical applications such as network and acoustic echo cancelers [1]-[3], HDTV channels [4], wireless multipath channels,
and underwater acoustic communications [5]. Conventional system identification algorithms such as LMS and NLMS are
sparsity agnostic, i.e., they are unaware of the underlying sparsity of the system impulse response. Recent studies have shown
that a priori knowledge about the system sparsity, if utilized properly by the identification algorithm, can result in
substantial improvement in its estimation performance. This has resulted in a flurry of research activity over the last decade
or so towards developing sparsity-aware adaptive filter algorithms, notable amongst them being the Proportionate Normalized
LMS (PNLMS) algorithm [6] and its variants [7], [8]. These current-regressor-based algorithms exhibit a slower convergence
rate for correlated input. As a solution, the proportionate-type concepts were extended to the data-reuse case to provide a
uniform convergence rate for both white and correlated input [9].

On the other hand, drawing from the ideas of compressed sensing (CS) [10]-[12], several new sparse adaptive filters have
been proposed in the last few years, notably the zero attracting LMS (ZALMS) [13], derived by incorporating the $\ell_1$ norm
penalty into the LMS cost function, which was later extended to ZA-APA [14]. The simplicity of the $\ell_1$ norm penalty makes
their implementation extremely simple. These algorithms perform well for systems that are highly sparse, but struggle as
the system sparsity decreases. Consequently, they cannot perform well when the system sparsity varies widely over time.

In [15], an improved PNLMS (IPNLMS) algorithm, which is a controlled mixture of the PNLMS and the NLMS, is proposed
to handle variable system sparsity. In [16], an adaptive convex combination of two IPNLMS filters is proposed that can
adapt to situations where the system sparsity is time varying and unknown. However, it provides a steady state MSD that
is the same as that of conventional sparsity-agnostic filters. Reweighted ZALMS (RZALMS) [13] addresses this issue by a
careful selection of the shrinkage parameter $\rho$, at the cost of increased complexity. As a solution to this problem, in
[17] a convex combination of the LMS and ZALMS algorithms has been proposed that switches between the ZALMS and LMS adaptive
filters depending on the sparsity level, thus enjoying robustness against time-varying system sparsity with reduced complexity
compared to the RZALMS. However, the aforementioned mechanisms that support time-varying sparsity do not perform
well under colored (correlated) input signal conditions.

In this paper, we propose an alternative method that is robust against both time-varying system sparseness and
coloredness (correlation) of the input signal, by using an adaptive convex combination of the Affine Projection Algorithm (APA)
and the Zero Attracting Affine Projection Algorithm (ZA-APA). The proposed algorithm uses the general framework of convex
combination of adaptive filters and has low complexity. The performance of the proposed scheme is first evaluated
analytically by carrying out the convergence analysis. This requires evaluation of the steady state cross correlation between
the output a priori errors generated by the APA and the ZA-APA based filters, and then relating it to the respective steady
state EMSE of the two filters. In our analysis, we have carried out this exercise for three different kinds of systems, namely,
non-sparse, semi-sparse and sparse. The analysis shows that the proposed combined filter always converges to the APA based
filter for a non-sparse system, while for semi-sparse systems it converges to a solution that can outperform both constituents.
For sparse systems, the proposed scheme usually converges to the ZA-APA based filter. However, by adjusting the regularization
parameter $\rho$, a solution like the semi-sparse case can be achieved, i.e., one outperforming both filters. Finally, the
robustness of the proposed method against variable sparsity and coloredness is verified by detailed simulation studies.

II. Problem Formulation, Proposed Algorithm and Performance Analysis

We consider the problem of identifying an unknown system (assumed to be sparse), modeled by the $L$-tap coefficient vector
$\mathbf{w}_{opt}$, which takes a signal $u(n)$ with variance $\sigma_u^2$ as the input and produces the observable output
$d(n) = \mathbf{u}^T(n)\mathbf{w}_{opt} + \epsilon(n)$, where $\mathbf{u}(n) = [u(n), u(n-1), \ldots, u(n-L+1)]^T$ is the input
data vector at time $n$, and $\epsilon(n)$ is the observation noise with zero mean and variance $\sigma_\epsilon^2$. In order to
identify the system, the approach of [18] is followed to deploy an adaptive convex combination of two adaptive filters as shown
in Fig. 1, where filter $l$ ($l = 1, 2$) uses the APA/ZA-APA algorithm to adapt a filter coefficient vector $\mathbf{w}_l(n)$
as follows,

$$\mathbf{w}_l(n+1) = \mathbf{w}_l(n) + \mu\,\mathbf{U}(n)\left(\varepsilon\mathbf{I}_M + \mathbf{U}^T(n)\mathbf{U}(n)\right)^{-1}\mathbf{e}_l(n) - \rho_l\,\mathrm{sgn}(\mathbf{w}_l(n)) \tag{1}$$

where $\mu$ is the step size (common to both filters), $\varepsilon$ is a small positive regularization constant, and the
controller constant is $\rho_l = 0$ when $l = 1$ and $\rho_l = \rho$ for $l = 2$. Further,
$\mathbf{e}_l(n) = \mathbf{d}(n) - \mathbf{y}_l(n) = [e_l(n), e_l(n-1), \ldots, e_l(n-M+1)]^T$ is the respective filter output
error vector, with $\mathbf{y}_l(n) = \mathbf{U}^T(n)\mathbf{w}_l(n)$ denoting the respective filter output vector, and
$\mathbf{U}(n) = [\mathbf{u}(n), \mathbf{u}(n-1), \ldots, \mathbf{u}(n-M+1)]$ is the $L \times M$ input data matrix. The
convex combination generates a combined output $y(n) = \lambda(n)y_1(n) + [1 - \lambda(n)]y_2(n)$, where
$y_l(n) = \mathbf{u}^T(n)\mathbf{w}_l(n)$ is the current output of filter $l$. The variable $\lambda(n) \in [0, 1]$ is
a mixing parameter, which is adapted by gradient descent to minimize the quadratic error function
of the overall filter, namely $e^2(n)$, where $e(n) = d(n) - y(n)$. However, such adaptation does not guarantee that $\lambda(n)$
will lie between 0 and 1. Therefore, instead of $\lambda(n)$, an equivalent variable $a(n)$ is updated, which expresses
$\lambda(n)$ as a sigmoidal function, i.e., $\lambda(n) = \mathrm{sgm}[a(n)] = \frac{1}{1 + e^{-a(n)}}$. The update equation of
$a(n)$ is given by [18],

$$a(n+1) = a(n) - \frac{\mu_a}{2}\frac{\partial e^2(n)}{\partial a(n)} = a(n) + \mu_a\,e(n)\,[y_1(n) - y_2(n)]\,\lambda(n)[1 - \lambda(n)] \tag{2}$$

In practice, $\lambda(n) \approx 1$ for $a(n) \gg 0$ and, conversely, $\lambda(n) \approx 0$ for $a(n) \ll 0$. Therefore,
instead of updating $a(n)$ up to $\pm\infty$, it is sufficient to restrict it to a range $[-a^+, +a^+]$ ($a^+$: a large finite
number), which limits the permissible range of $\lambda(n)$ to $[1 - \lambda^+, \lambda^+]$, where
$\lambda^+ = \mathrm{sgm}(a^+)$.
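The updates (1) and (2) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `combined_apa_zaapa`, the default parameter values and the choice to share one regularized inverse between the two filters are assumptions made here for brevity.

```python
import numpy as np

def sigmoid(a):
    """Sigmoidal map lambda = sgm(a) used for the mixing parameter."""
    return 1.0 / (1.0 + np.exp(-a))

def combined_apa_zaapa(u, d, L, M, mu=0.5, rho=1e-4, mu_a=10.0,
                       a_plus=4.0, eps=1e-3):
    """Convex combination of APA (Filter 1) and ZA-APA (Filter 2).

    All default values are illustrative assumptions, not taken from the paper.
    Returns the two weight vectors and the history of lambda(n).
    """
    w = np.zeros((2, L))          # row 0: APA, row 1: ZA-APA
    rhos = (0.0, rho)             # rho_1 = 0, rho_2 = rho, as in (1)
    a = 0.0                       # lambda(0) = 0.5
    lam_hist = []
    for n in range(L + M - 2, len(u)):
        # L x M data matrix; column k is u(n-k) = [u(n-k), ..., u(n-k-L+1)]^T
        U = np.column_stack([u[n - k - L + 1:n - k + 1][::-1] for k in range(M)])
        dvec = d[n - M + 1:n + 1][::-1]       # [d(n), ..., d(n-M+1)]
        lam = sigmoid(a)
        y = w @ U[:, 0]                       # current outputs y_1(n), y_2(n)
        e = d[n] - (lam * y[0] + (1.0 - lam) * y[1])
        # Update (2), with a(n) restricted to [-a_plus, a_plus]
        a = float(np.clip(a + mu_a * e * (y[0] - y[1]) * lam * (1.0 - lam),
                          -a_plus, a_plus))
        inv = np.linalg.inv(eps * np.eye(M) + U.T @ U)
        for l in range(2):
            e_l = dvec - U.T @ w[l]           # error vector of filter l
            # Update (1): affine projection step plus zero attractor
            w[l] = w[l] + mu * U @ inv @ e_l - rhos[l] * np.sign(w[l])
        lam_hist.append(lam)
    return w, np.array(lam_hist)
```

Because of the clipping step, every recorded value of $\lambda(n)$ stays inside $[1 - \lambda^+, \lambda^+]$ regardless of the input.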



Figure 1. The proposed adaptive convex combination of two adaptive filters (Filter 1: APA and Filter 2: ZA-APA).


A. Important Assumptions and Definitions 

In [19], [20], the convergence analysis of APA was presented using an equivalent form of APA known as Normalized LMS with
Orthogonal Correction Factors (NLMS-OCF), along with the assumptions proposed by Slock in [21] for modeling the system
input. The analysis in [19] provides the basis for the work presented in this paper. A brief introduction to the NLMS-OCF
algorithm is presented below. For the given input signal $u(n)$, observation noise $\epsilon(n)$ and reference signal $d(n)$,
the NLMS-OCF algorithm uses orthogonal correction factors based on the past $M - 1$ input vectors (each of length $L$) to
update the weights $\mathbf{w}(n)$ in every iteration. The weight update equation is

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \mu_0\,\mathbf{u}(n) + \mu_1\,\mathbf{u}^1(n) + \ldots + \mu_{M-1}\,\mathbf{u}^{M-1}(n) \tag{3}$$

where the notation $\mathbf{u}^k(n)$ refers to the component of the vector $\mathbf{u}(n-k)$ that is orthogonal to the vectors
$\mathbf{u}(n), \mathbf{u}(n-1), \ldots, \mathbf{u}(n-k+1)$, and $k = 0, 1, \ldots, M-1$ is the previous input signal vector
index.

The coefficient $\mu_k$ is chosen as

$$\mu_k = \begin{cases} \dfrac{\mu\,e(n)}{\mathbf{u}^T(n)\mathbf{u}(n)}, & \text{for } k = 0, \text{ if } \|\mathbf{u}(n)\| \neq 0 \\[2ex] \dfrac{\mu\,e^k(n)}{(\mathbf{u}^k(n))^T\mathbf{u}^k(n)}, & \text{for } k = 1, 2, \ldots, M-1, \text{ if } \|\mathbf{u}^k(n)\| \neq 0 \end{cases} \tag{4}$$

where

$$\begin{aligned} e(n) &= d(n) - \mathbf{w}^T(n)\mathbf{u}(n) \\ e^k(n) &= d(n-k) - (\mathbf{w}^k(n))^T\mathbf{u}(n-k) \quad \text{for } k = 1, 2, \ldots, M-1 \\ \mathbf{w}^k(n) &= \mathbf{w}(n) + \mu_0\,\mathbf{u}(n) + \mu_1\,\mathbf{u}^1(n) + \ldots + \mu_{k-1}\,\mathbf{u}^{k-1}(n) \end{aligned} \tag{5}$$

The algorithm can be seen as the process of computing weight updates, using NLMS, based on the current input data vector, 
u(n), as well as the orthogonal components from each of the previous M — 1 input data vectors. 

In addition to NLMS-OCF, as described in [19] the performance analysis is done based on the following assumptions on 
the signal and the underlying system 

A1) The input signal vectors $\mathbf{u}(n)$ are assumed to be zero mean with covariance matrix

$$\mathbf{R} = E[\mathbf{u}(n)\mathbf{u}^T(n)] = \mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^T \tag{6}$$

where $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_L)$ and
$\mathbf{V} = (\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_L)$. Here, $\lambda_1, \lambda_2, \ldots, \lambda_L$ are the
eigenvalues of the input covariance matrix $\mathbf{R}$ and $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_L$ are the
corresponding orthonormal eigenvectors ($\mathbf{V}^T\mathbf{V} = \mathbf{I}$), i.e., $\mathbf{V}$ is a unitary matrix.

A2) The observation noise $\epsilon(n)$ is zero-mean white noise with variance $\sigma_\epsilon^2$ and is independent of
$\mathbf{u}(n)$; the initial conditions $\mathbf{w}_l(0)$ and $a(0)$ are also independent of $\mathbf{u}(n)$, $d(n)$.

A3) The random signal vector $\mathbf{u}(n)$ is the product of three i.i.d. random variables, that is

$$\mathbf{u}(n) = s(n)\,r(n)\,\mathbf{v}(n) \tag{7a}$$

where

$$P\{s(n) = \pm 1\} = \tfrac{1}{2}, \qquad r(n) \sim \|\mathbf{u}(n)\|, \qquad P\{\mathbf{v}(n) = \mathbf{v}_i\} = p_i, \quad i = 1, 2, \ldots, L \tag{7b}$$

Here $r(n) \sim \|\mathbf{u}(n)\|$ means that $r(n)$ has the same distribution as the norm of the true input signal vectors.

Assumption A3 was first introduced by Slock in [21]. It leads to a simple distribution for the vectors $\mathbf{u}(n)$ while
remaining consistent with the actual first and second order statistics of the input signal in A1. Assumption A3 was used in
[19], [20], as well as here, to simplify the convergence analysis.

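As mentioned in [19], using Assumption A3, the weight update in (3)-(5) can be simplified, since the computation of the
orthogonal components $\mathbf{u}^k(n)$ becomes unnecessary, i.e., each $\mathbf{u}(n)$ is already chosen from the orthogonal
set. Hence, the NLMS-OCF update equation can be rewritten as follows

$$\begin{aligned} \mathbf{w}(n+1) &= \mathbf{w}(n) + \mu_0\,\mathbf{u}(n) + \mu_1\,\mathbf{u}(n-1) + \ldots + \mu_{M-1}\,\mathbf{u}(n-M+1) \\ e(n) &= d(n) - \mathbf{w}^T(n)\mathbf{u}(n), \text{ and} \\ e^k(n) &= d(n-k) - \mathbf{w}^T(n)\mathbf{u}(n-k) \quad \text{for } k = 1, 2, \ldots, M-1 \end{aligned} \tag{8}$$

where

$$\mu_k = \begin{cases} \dfrac{\mu\,e(n)}{\mathbf{u}^T(n)\mathbf{u}(n)}, & \text{for } k = 0, \text{ if } \|\mathbf{u}(n)\| \neq 0 \\[2ex] \dfrac{\mu\,e^k(n)}{\mathbf{u}^T(n-k)\mathbf{u}(n-k)}, & \text{for } k = 1, 2, \ldots, M-1, \text{ if } \|\mathbf{u}^k(n)\| \neq 0 \end{cases} \tag{9}$$

Therefore, using the above NLMS-OCF approximation, the weight update equations of Filter 1 and Filter 2 are as follows.
APA weight update equation:

$$\mathbf{w}_1(n+1) = \mathbf{w}_1(n) + \mu_0\,\mathbf{u}(n) + \mu_1\,\mathbf{u}(n-1) + \ldots + \mu_{M-1}\,\mathbf{u}(n-M+1) \tag{10}$$

and ZA-APA weight update equation:

$$\mathbf{w}_2(n+1) = \mathbf{w}_2(n) + \mu_0\,\mathbf{u}(n) + \mu_1\,\mathbf{u}(n-1) + \ldots + \mu_{M-1}\,\mathbf{u}(n-M+1) - \rho\,\mathrm{sgn}(\mathbf{w}_2(n)) \tag{11}$$

where $\mu_k$ can be calculated using (9), with the errors in (8) evaluated using the weight vector of the respective filter.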

Next, as in [18], certain definitions that are useful in the analysis are presented below. For $l = 1, 2$, we define,

a) Weight Error Vectors: $\tilde{\mathbf{w}}_l(n) = \mathbf{w}_{opt} - \mathbf{w}_l(n)$;

b) Equivalent Weight Vector for the combined filter: $\mathbf{w}_c(n) = \lambda(n)\mathbf{w}_1(n) + [1 - \lambda(n)]\mathbf{w}_2(n)$;

c) Equivalent Weight Error Vector for the combined filter: $\tilde{\mathbf{w}}_c(n) = \mathbf{w}_{opt} - \mathbf{w}_c(n) = \lambda(n)\tilde{\mathbf{w}}_1(n) + [1 - \lambda(n)]\tilde{\mathbf{w}}_2(n)$;

d) A Priori Errors: $e_{a,l}(n) = \mathbf{u}^T(n)\tilde{\mathbf{w}}_l(n)$ and $e_a(n) = \mathbf{u}^T(n)\tilde{\mathbf{w}}_c(n)$. Clearly, $e_a(n) = \lambda(n)e_{a,1}(n) + [1 - \lambda(n)]e_{a,2}(n)$ and
$e(n) = e_a(n) + \epsilon(n)$;

e) Excess Mean Square Error (EMSE): $J_{ex,l}(n) = E[e_{a,l}^2(n)]$ for $l = 1, 2$ and $J_{ex}(n) = E[e_a^2(n)]$;

f) Cross EMSE: $J_{ex,12}(n) = E[e_{a,1}(n)e_{a,2}(n)] \le \sqrt{J_{ex,1}(n)}\sqrt{J_{ex,2}(n)}$ [from the Cauchy-Schwarz inequality]. This means
$J_{ex,12}(n)$ cannot be greater than both $J_{ex,1}(n)$ and $J_{ex,2}(n)$ simultaneously.

From (2), one can write,

$$E[a(n+1)] = E[a(n)] + \mu_a E\left[e(n)\,(y_1(n) - y_2(n))\,\lambda(n)(1 - \lambda(n))\right] \tag{12}$$

As shown in [18], the convergence of $E[a(n)]$ in (12) depends on the steady state values of the individual filter EMSE and
the cross EMSE, namely, $J_{ex,l}(\infty) = \lim_{n\to\infty} J_{ex,l}(n)$ and $J_{ex,12}(\infty) = \lim_{n\to\infty} J_{ex,12}(n)$ respectively. In practice, however, both $J_{ex,l}(n)$
and $J_{ex,12}(n)$ take only a finite number of steps to reach their steady state values, as both the APA and the ZA-APA
algorithms converge in a finite number of iterations. Substituting $e(n) = e_a(n) + \epsilon(n)$ in (12), where $e_a(n)$ is defined above,
noting that $y_1(n) - y_2(n) = e_{a,2}(n) - e_{a,1}(n)$ and also that $E[\epsilon(n)] = 0$, and assuming like [18] that in steady state $\lambda(n)$ is
independent of the a priori errors $e_{a,l}(n)$, it is easy to verify that for large $n$ (theoretically for $n \to \infty$),

$$E[a(n+1)] = E[a(n)] + \mu_a E\left[\lambda(n)[1 - \lambda(n)]^2\right]\Delta J_2 - \mu_a E\left[\lambda^2(n)[1 - \lambda(n)]\right]\Delta J_1 \tag{13}$$

where $\Delta J_1 = J_{ex,1}(\infty) - J_{ex,12}(\infty)$ and $\Delta J_2 = J_{ex,2}(\infty) - J_{ex,12}(\infty)$. By assuming that $\Delta J_1$ and $\Delta J_2$ are constant (i.e., the APA
and ZA-APA algorithms have converged), Eq. (13) can be used to yield the dynamics of the evolution of $E[a(n)]$. To analyze
the convergence of $E[a(n)]$, we need to evaluate $\Delta J_1$ and $\Delta J_2$ of the proposed combination.

III. Performance Analysis of the Combination 

In this section, we examine the convergence behavior of the proposed convex combination for various levels of sparsity, i.e.,
how the filter coefficient vector $\mathbf{w}_c(n)$ of the proposed combination adapts dynamically to an optimum value as dictated by
the sparsity of the system. For this, we first evaluate $J_{ex,2}(\infty)$ and $J_{ex,12}(\infty)$.


A. Mean Convergence Analysis of the ZA-APA Algorithm 

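From Eq. (11), the recursion for the weight error vector of the ZA-APA algorithm can be written as follows:

$$\tilde{\mathbf{w}}_2(n+1) = \left[\mathbf{I}_L - \mu\sum_{j\in J_n}\frac{\mathbf{u}(n-j)\mathbf{u}^T(n-j)}{\mathbf{u}^T(n-j)\mathbf{u}(n-j)}\right]\tilde{\mathbf{w}}_2(n) - \mu\sum_{j\in J_n}\frac{\mathbf{u}(n-j)\,\epsilon(n-j)}{\mathbf{u}^T(n-j)\mathbf{u}(n-j)} + \rho\,\mathrm{sgn}(\mathbf{w}_2(n)) \tag{14}$$

where $J_n \subseteq \{0, 1, \ldots, M-1\}$ is the set of $M$ or fewer indices $j$ for which the input regressor vectors $\mathbf{u}(n-j)$ are orthogonal
to each other. The orthogonalization process determines the indices forming the set $J_n$. Equation (14) forms the basis
for the performance analysis of the ZA-APA algorithm. Using the statistical independence between $\tilde{\mathbf{w}}_2(n)$ and $\mathbf{u}(n)$ (i.e., the
"independence assumption"), and recalling that $\epsilon(n)$ is a zero-mean i.i.d. random variable which is independent of $\mathbf{u}(n)$ and
thus of $\tilde{\mathbf{w}}_2(n)$, one can write

$$E[\tilde{\mathbf{w}}_2(n+1)] = \left[\mathbf{I}_L - \mu E\left[\sum_{j\in J_n}\frac{\mathbf{u}(n-j)\mathbf{u}^T(n-j)}{\mathbf{u}^T(n-j)\mathbf{u}(n-j)}\right]\right]E[\tilde{\mathbf{w}}_2(n)] + \rho\,E[\mathrm{sgn}(\mathbf{w}_2(n))] \tag{15}$$

Using A3), we can write the outer-to-inner product ratio as,

$$\frac{\mathbf{u}(n-j)\mathbf{u}^T(n-j)}{\mathbf{u}^T(n-j)\mathbf{u}(n-j)} = \frac{s(n-j)\,r(n-j)\,\mathbf{v}(n-j)\mathbf{v}^T(n-j)\,s(n-j)\,r(n-j)}{s^2(n-j)\,r^2(n-j)\,\|\mathbf{v}(n-j)\|^2} = \mathbf{v}(n-j)\mathbf{v}^T(n-j) \tag{16}$$

where $\mathbf{v}(n-j) \in \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_L\}$. Note that the above result is independent of the norm of $\mathbf{u}(n-j)$.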
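The norm independence stated for (16) is easy to confirm numerically. The sketch below is an illustration only; the helper names `outer_to_inner` and `check_norm_independence` and the randomly generated covariance matrix are assumptions introduced here.

```python
import numpy as np

def outer_to_inner(u):
    """Ratio u u^T / (u^T u) appearing in (14)-(16)."""
    return np.outer(u, u) / (u @ u)

def check_norm_independence(L=5, seed=0):
    """Build u = s * r * v as in A3 and compare the ratio with v v^T."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((L, L))
    R = A @ A.T                        # an arbitrary symmetric covariance matrix
    _, V = np.linalg.eigh(R)           # columns are orthonormal eigenvectors
    v = V[:, 2]
    results = []
    for s, r in [(1.0, 0.3), (-1.0, 7.0), (1.0, 1e3)]:
        u = s * r * v                  # sign s, norm r, direction v
        results.append(np.allclose(outer_to_inner(u), np.outer(v, v)))
    return results
```

Whatever the sign $s$ and norm $r$, the ratio collapses to the rank-one projector $\mathbf{v}\mathbf{v}^T$, which is what allows the sums over $J_n$ to be replaced by sums over eigenvector indices.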

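Using the result presented in (16), Eq. (15) can be rewritten as

$$E[\tilde{\mathbf{w}}_2(n+1)] = \left[\mathbf{I}_L - \mu E\left[\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right]\right]E[\tilde{\mathbf{w}}_2(n)] + \rho\,E[\mathrm{sgn}(\mathbf{w}_2(n))] \tag{17}$$

where

$$K_n = \left\{k : \exists\, j \in J_n \ni \frac{\mathbf{u}(n-j)\mathbf{u}^T(n-j)}{\mathbf{u}^T(n-j)\mathbf{u}(n-j)} = \mathbf{v}_k\mathbf{v}_k^T\right\} \tag{18}$$

We define the vector $\boldsymbol{\alpha}_2(n)$ as the representation of $E[\tilde{\mathbf{w}}_2(n)]$ in terms of the orthonormal vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_L\}$, that is,
$\boldsymbol{\alpha}_2(n) = \mathbf{V}^T E[\tilde{\mathbf{w}}_2(n)]$ and $\alpha_{2,i}(n) = \mathbf{v}_i^T E[\tilde{\mathbf{w}}_2(n)]$.

Using this notation, pre-multiplying (17) by $\mathbf{v}_i^T$ results in

$$\mathbf{v}_i^T E[\tilde{\mathbf{w}}_2(n+1)] = \mathbf{v}_i^T\left[\mathbf{I}_L - \mu E\left[\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right]\right]E[\tilde{\mathbf{w}}_2(n)] + \rho\,\mathbf{v}_i^T E[\mathrm{sgn}(\mathbf{w}_2(n))] \tag{19}$$

From the orthonormality of the $\mathbf{v}_i$'s we have

$$\mathbf{v}_i^T\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T = \begin{cases} \mathbf{v}_i^T, & \text{if } i \in K_n \\ \mathbf{0}, & \text{if } i \notin K_n \end{cases} \tag{20}$$

Using the above result and substituting $\Pr(i \in K_n) = P_i$, Eq. (19) becomes

$$\alpha_{2,i}(n+1) = [1 - \mu P_i]\,\alpha_{2,i}(n) + \rho\,b_i(n) \tag{21}$$

where $b_i(n) = \mathbf{v}_i^T E[\mathrm{sgn}(\mathbf{w}_2(n))]$, and the term $\Pr(i \in K_n) = P_i$ in (21) is the probability of drawing the eigenvector $\mathbf{v}_i$
from the eigenvector set $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_L\}$ in at most $M$ trials. If $p_i$ is the probability of drawing the eigenvector
$\mathbf{v}_i$ in a single trial, then $\Pr(i \in K_n) = P_i = 1 - (1 - p_i)^M$.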

In steady state, i.e., as $n \to \infty$, from (21) it is possible to write

$$\lim_{n\to\infty}\alpha_{2,i}(n) = \frac{\rho}{\mu P_i}\lim_{n\to\infty} b_i(n) \tag{22}$$

To evaluate $\lim_{n\to\infty} b_i(n)$, we classify the unknown parameters (i.e., filter taps) into two disjoint subsets, as in [Ts], denoted
NZ and Z for active and inactive filter taps respectively. If the $i$-th tap of the optimum filter is active, then $i \in NZ$, else
$i \in Z$. For a sufficiently small controller constant $\rho$, for every $i \in NZ$ we have $|\tilde{w}_{2,i}(\infty)| \ll |w_{opt,i}|$. On the other hand, for
every $i \in Z$ we have $|\tilde{w}_{2,i}(\infty)| > |w_{opt,i}|$. Since $w_{opt,i} = 0$ for every $i \in Z$, for zero-mean Gaussian input we can assume
$w_{2,i}(\infty)$ to be Gaussian with zero mean in steady state.

Therefore, from the aforementioned remarks and for a sufficiently small controller constant $\rho$, we can approximate

$$\begin{aligned} \lim_{n\to\infty}\mathrm{sgn}(w_{2,i}(n)) &= \mathrm{sgn}(w_{opt,i}) && \text{for } i \in NZ \\ \lim_{n\to\infty}E[\mathrm{sgn}(w_{2,i}(n))] &= 0 && \text{for } i \in Z \end{aligned} \tag{23}$$

These approximations are helpful to evaluate the expectation of the nonlinear function $\lim_{n\to\infty} b_i(n)$ involved in (22) for both white
and correlated inputs. To derive closed form expressions for the steady state mean of the weight deviation ($\alpha_{2,i}(\infty)$) of
active and inactive coefficients, at this stage we assume that the ZA-APA algorithm is operating on white input. This assumption
simplifies the approach, and the eigenvector set $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_L\}$ becomes the trivial basis $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_L\}$. Therefore, in steady
state for the ZA-APA algorithm, we have

$$\lim_{n\to\infty}\alpha_{2,i}(n) = \lim_{n\to\infty}E[\tilde{w}_{2,i}(n)] = \begin{cases} \dfrac{\rho}{\mu P_i}\,\mathrm{sgn}(w_{opt,i}), & \text{for } i \in NZ \\[1.5ex] 0, & \text{for } i \in Z \end{cases} \tag{24}$$
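The fixed point in (22) and (24) can be checked by iterating the scalar recursion (21) directly. The following is a minimal sketch with hypothetical helper names; the equal single-draw probability $p_i = 1/L$ used in the usage line is an assumption for a white-input illustration and is not stated in the text.

```python
def hit_probability(p_i, M):
    """P_i = 1 - (1 - p_i)^M: chance that eigenvector i is drawn within M trials."""
    return 1.0 - (1.0 - p_i) ** M

def mean_deviation_limit(mu, rho, P_i, b_i, n_iter=2000):
    """Iterate alpha(n+1) = (1 - mu*P_i) * alpha(n) + rho * b_i, as in (21)."""
    alpha = 0.0
    for _ in range(n_iter):
        alpha = (1.0 - mu * P_i) * alpha + rho * b_i
    return alpha

# Illustrative values (assumed): L = 16 taps, projection order M = 4, p_i = 1/L
P = hit_probability(1.0 / 16, 4)
bias_active = mean_deviation_limit(mu=0.5, rho=1e-3, P_i=P, b_i=1.0)
bias_inactive = mean_deviation_limit(mu=0.5, rho=1e-3, P_i=P, b_i=0.0)
```

With $b_i = \mathrm{sgn}(w_{opt,i}) = 1$ the iteration settles at $\rho/(\mu P_i)$, while $b_i = 0$ leaves the inactive tap unbiased, in line with the two branches of (24). A larger projection order $M$ increases $P_i$ and therefore reduces the bias on active taps.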


B. Excess Mean Square Error Analysis of the ZA-APA Algorithm 

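The EMSE of the ZA-APA algorithm can be written in the following form,

$$J_{ex,2}(n) = E[e_{a,2}^2(n)] = E[\tilde{\mathbf{w}}_2^T(n)\,\mathbf{R}\,\tilde{\mathbf{w}}_2(n)] = \mathrm{Tr}(\mathbf{R}\,\mathbf{K}_2(n)) = \mathrm{Tr}(\boldsymbol{\Lambda}\mathbf{V}^T\mathbf{K}_2(n)\mathbf{V}) \tag{25}$$

where $E[\tilde{\mathbf{w}}_2(n)\tilde{\mathbf{w}}_2^T(n)] = \mathbf{K}_2(n)$.

Let us define the diagonal elements of the transformed covariance matrix $\mathbf{V}^T\mathbf{K}_2(n)\mathbf{V}$ as $\lambda_{2,i}(n)$ for $i = 1, 2, \ldots, L$. That is

$$[\mathbf{V}^T\mathbf{K}_2(n)\mathbf{V}]_{ii} = \mathbf{v}_i^T\mathbf{K}_2(n)\mathbf{v}_i = \lambda_{2,i}(n) \tag{26}$$

With the above notation, the EMSE of the ZA-APA algorithm can be written as follows:

$$J_{ex,2}(n) = \sum_{i=1}^{L}\lambda_i\,\lambda_{2,i}(n) \tag{27}$$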
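The equivalence between the trace form (25) and the eigen-domain sum (27) can be verified numerically. This sketch uses arbitrary symmetric positive semidefinite matrices standing in for $\mathbf{R}$ and $\mathbf{K}_2(n)$; the function names are hypothetical.

```python
import numpy as np

def emse_trace(R, K):
    """EMSE as Tr(R K), the trace form in (25)."""
    return float(np.trace(R @ K))

def emse_eigen(R, K):
    """EMSE as sum_i lambda_i * [V^T K V]_ii, the form in (27)."""
    eigvals, V = np.linalg.eigh(R)        # R = V diag(eigvals) V^T
    lam2 = np.diag(V.T @ K @ V)           # diagonal elements defined in (26)
    return float(eigvals @ lam2)

def random_psd(L, rng):
    """Arbitrary symmetric positive semidefinite test matrix."""
    A = rng.standard_normal((L, L))
    return A @ A.T
```

Only the diagonal of $\mathbf{V}^T\mathbf{K}_2(n)\mathbf{V}$ enters the EMSE, which is why the subsequent analysis tracks the scalars $\lambda_{2,i}(n)$ rather than the full matrix.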


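Using the statistical independence between $\tilde{\mathbf{w}}_2(n)$ and $\mathbf{u}(n)$ (i.e., the "independence assumption"), and recalling that $\epsilon(n)$ is
zero-mean and also independent of $\mathbf{u}(n)$ and thus of $\tilde{\mathbf{w}}_2(n)$, one can write the recursion for the mean square deviation of the
ZA-APA algorithm as follows:

$$\begin{aligned} \mathbf{K}_2(n+1) ={}& \mathbf{K}_2(n) - \mu E\left[\sum_{j\in J_n}\frac{\mathbf{u}(n-j)\mathbf{u}^T(n-j)}{\mathbf{u}^T(n-j)\mathbf{u}(n-j)}\right]\mathbf{K}_2(n) - \mu\,\mathbf{K}_2(n)\,E\left[\sum_{m\in J_n}\frac{\mathbf{u}(n-m)\mathbf{u}^T(n-m)}{\mathbf{u}^T(n-m)\mathbf{u}(n-m)}\right] \\ &+ \mu^2 E\left[\left(\sum_{j\in J_n}\frac{\mathbf{u}(n-j)\mathbf{u}^T(n-j)}{\mathbf{u}^T(n-j)\mathbf{u}(n-j)}\right)\mathbf{K}_2(n)\left(\sum_{m\in J_n}\frac{\mathbf{u}(n-m)\mathbf{u}^T(n-m)}{\mathbf{u}^T(n-m)\mathbf{u}(n-m)}\right)\right] \\ &+ \mu^2 E\left[\left(\sum_{j\in J_n}\frac{\mathbf{u}(n-j)\,\epsilon(n-j)}{\mathbf{u}^T(n-j)\mathbf{u}(n-j)}\right)\left(\sum_{m\in J_n}\frac{\mathbf{u}(n-m)\,\epsilon(n-m)}{\mathbf{u}^T(n-m)\mathbf{u}(n-m)}\right)^T\right] \\ &+ \rho\,E\left[\mathbf{I}_L - \mu\sum_{j\in J_n}\frac{\mathbf{u}(n-j)\mathbf{u}^T(n-j)}{\mathbf{u}^T(n-j)\mathbf{u}(n-j)}\right]E\left[\tilde{\mathbf{w}}_2(n)\,\mathrm{sgn}(\mathbf{w}_2^T(n))\right] \\ &+ \rho\,E\left[\mathrm{sgn}(\mathbf{w}_2(n))\,\tilde{\mathbf{w}}_2^T(n)\right]E\left[\mathbf{I}_L - \mu\sum_{m\in J_n}\frac{\mathbf{u}(n-m)\mathbf{u}^T(n-m)}{\mathbf{u}^T(n-m)\mathbf{u}(n-m)}\right] \\ &+ \rho^2 E\left[\mathrm{sgn}(\mathbf{w}_2(n))\,\mathrm{sgn}(\mathbf{w}_2^T(n))\right] \end{aligned} \tag{28}$$

Using (16) to replace the outer-to-inner product ratios with $\mathbf{v}_k\mathbf{v}_k^T$, one can have

$$\begin{aligned} \mathbf{K}_2(n+1) ={}& \mathbf{K}_2(n) - \mu E\left[\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right]\mathbf{K}_2(n) - \mu\,\mathbf{K}_2(n)\,E\left[\sum_{m\in K_n}\mathbf{v}_m\mathbf{v}_m^T\right] \\ &+ \mu^2 E\left[\left(\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right)\mathbf{K}_2(n)\left(\sum_{m\in K_n}\mathbf{v}_m\mathbf{v}_m^T\right)\right] + \mu^2\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right]E\left[\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right] \\ &+ \rho\,E\left[\mathbf{I}_L - \mu\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right]E\left[\tilde{\mathbf{w}}_2(n)\,\mathrm{sgn}(\mathbf{w}_2^T(n))\right] \\ &+ \rho\,E\left[\mathrm{sgn}(\mathbf{w}_2(n))\,\tilde{\mathbf{w}}_2^T(n)\right]E\left[\mathbf{I}_L - \mu\sum_{m\in K_n}\mathbf{v}_m\mathbf{v}_m^T\right] \\ &+ \rho^2 E\left[\mathrm{sgn}(\mathbf{w}_2(n))\,\mathrm{sgn}(\mathbf{w}_2^T(n))\right] \end{aligned} \tag{29}$$

With the notation presented in (26), pre- and post-multiplication of (29) by $\mathbf{v}_i^T$ and $\mathbf{v}_i$ respectively results in

$$\begin{aligned} \lambda_{2,i}(n+1) ={}& \lambda_{2,i}(n) - \mu E\left[\mathbf{v}_i^T\left(\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right)\right]\mathbf{K}_2(n)\mathbf{v}_i - \mu\,\mathbf{v}_i^T\mathbf{K}_2(n)\,E\left[\left(\sum_{m\in K_n}\mathbf{v}_m\mathbf{v}_m^T\right)\mathbf{v}_i\right] \\ &+ \mu^2 E\left[\mathbf{v}_i^T\left(\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right)\mathbf{K}_2(n)\left(\sum_{m\in K_n}\mathbf{v}_m\mathbf{v}_m^T\right)\mathbf{v}_i\right] + \mu^2\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right]E\left[\mathbf{v}_i^T\left(\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right)\mathbf{v}_i\right] \\ &+ \rho\,E\left[\mathbf{v}_i^T\left(\mathbf{I}_L - \mu\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right)\right]E\left[\tilde{\mathbf{w}}_2(n)\,\mathrm{sgn}(\mathbf{w}_2^T(n))\right]\mathbf{v}_i \\ &+ \rho\,\mathbf{v}_i^T E\left[\mathrm{sgn}(\mathbf{w}_2(n))\,\tilde{\mathbf{w}}_2^T(n)\right]E\left[\left(\mathbf{I}_L - \mu\sum_{m\in K_n}\mathbf{v}_m\mathbf{v}_m^T\right)\mathbf{v}_i\right] \\ &+ \rho^2 E\left[\mathbf{v}_i^T\left(\mathrm{sgn}(\mathbf{w}_2(n))\,\mathrm{sgn}(\mathbf{w}_2^T(n))\right)\mathbf{v}_i\right] \end{aligned} \tag{30}$$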

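From the orthonormality of the $\mathbf{v}_i$'s, using the result presented in (20) and substituting $\Pr(i \in K_n) = P_i$, Eq. (30) becomes

$$\begin{aligned} \lambda_{2,i}(n+1) ={}& \underbrace{[1 - \mu(2-\mu)P_i]\,\lambda_{2,i}(n) + \mu^2\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right]P_i}_{\text{same form as } \lambda_{1,i}(n+1) \text{ of APA [19]}} \\ &+ \rho(1 - \mu P_i)\left[\mathbf{v}_i^T\boldsymbol{\Phi}(n)\mathbf{v}_i\right] + \rho(1 - \mu P_i)\,\mathbf{v}_i^T\boldsymbol{\Phi}^T(n)\mathbf{v}_i + \rho^2\left[\mathbf{v}_i^T\boldsymbol{\Psi}(n)\mathbf{v}_i\right] \end{aligned} \tag{31}$$

where $\boldsymbol{\Phi}(n) = E[\tilde{\mathbf{w}}_2(n)\,\mathrm{sgn}(\mathbf{w}_2^T(n))]$ and $\boldsymbol{\Psi}(n) = E[\mathrm{sgn}(\mathbf{w}_2(n))\,\mathrm{sgn}(\mathbf{w}_2^T(n))]$.

In steady state, i.e., as $n \to \infty$, from (31) it is possible to write

$$\begin{aligned} \lambda_{2,i}(\infty) ={}& \underbrace{\frac{\mu}{2-\mu}\,\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right]}_{\lambda_{1,i}(\infty) \text{ of APA}} + \frac{\rho(1 - \mu P_i)}{\mu(2-\mu)P_i}\left[\mathbf{v}_i^T\boldsymbol{\Phi}(\infty)\mathbf{v}_i\right] \\ &+ \frac{\rho(1 - \mu P_i)}{\mu(2-\mu)P_i}\,\mathbf{v}_i^T\boldsymbol{\Phi}^T(\infty)\mathbf{v}_i + \frac{\rho^2}{\mu(2-\mu)P_i}\left[\mathbf{v}_i^T\boldsymbol{\Psi}(\infty)\mathbf{v}_i\right] \end{aligned} \tag{32}$$

where $\boldsymbol{\Phi}(\infty) = \lim_{n\to\infty}E[\tilde{\mathbf{w}}_2(n)\,\mathrm{sgn}(\mathbf{w}_2^T(n))]$ and $\boldsymbol{\Psi}(\infty) = \lim_{n\to\infty}E[\mathrm{sgn}(\mathbf{w}_2(n))\,\mathrm{sgn}(\mathbf{w}_2^T(n))]$.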

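To evaluate $\boldsymbol{\Phi}(\infty)$ and $\boldsymbol{\Psi}(\infty)$, as mentioned in the convergence in mean section, the unknown parameters (i.e., filter taps) are
classified into two disjoint subsets, denoted NZ and Z for active and inactive filter taps respectively. Under the small misadjustment
condition (i.e., the steady state standard deviation $\sigma_{w_{2,i}}$ of $w_{2,i}$ is very small), and also following the lines of [^, we can
approximate $\lim_{n\to\infty}E[\tilde{w}_{2,i}(n)\,\mathrm{sgn}(w_{2,j}(n))] = \lim_{n\to\infty}E[\tilde{w}_{2,i}(n)]\,E[\mathrm{sgn}(w_{2,j}(n))]$ and $\lim_{n\to\infty}E[\mathrm{sgn}(w_{2,i}(n))\,\mathrm{sgn}(w_{2,j}(n))] =
\lim_{n\to\infty}E[\mathrm{sgn}(w_{2,i}(n))]\,E[\mathrm{sgn}(w_{2,j}(n))]$ for all $i \neq j$. These approximations are helpful to evaluate the expectations of the
nonlinear functions involved in (32) for both white and correlated input. To derive closed form expressions for the steady
state mean square deviation ($\lambda_{2,i}(\infty)$) of active and inactive coefficients, we assume that the ZA-APA algorithm is operating
on white input. This assumption simplifies the approach, and the eigenvector set $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_L\}$ becomes the trivial basis
$\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_L\}$. Therefore, in steady state for the ZA-APA algorithm, we have

$$\lim_{n\to\infty}E[\tilde{w}_{2,i}(n)\,\mathrm{sgn}(w_{2,i}(n))] = \begin{cases} E[\tilde{w}_{2,i}(\infty)]\,\mathrm{sgn}(w_{opt,i}), & \text{for } i \in NZ \\ E[\tilde{w}_{2,i}(\infty)\,\mathrm{sgn}(w_{2,i}(\infty))], & \text{for } i \in Z \end{cases} \tag{33}$$

and

$$\lim_{n\to\infty}E[\mathrm{sgn}(w_{2,i}(n))\,\mathrm{sgn}(w_{2,i}(n))] = \begin{cases} E\left[(\mathrm{sgn}(w_{opt,i}))^2\right] = 1, & \text{for } i \in NZ \\ E\left[(\mathrm{sgn}(w_{2,i}(\infty)))^2\right] = 1, & \text{for } i \in Z \end{cases} \tag{34}$$

Using these results together with (24), for every $i \in NZ$ we can write

$$\lambda_{2,i}(\infty) = \underbrace{\frac{\mu}{2-\mu}\,\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right]}_{\lambda_{1,i}(\infty)} + \frac{\rho^2(2 - \mu P_i)}{\mu^2(2-\mu)P_i^2} \tag{35}$$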


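On the other hand, for $i \in Z$, $\tilde{w}_{2,i}(\infty) = w_{opt,i} - w_{2,i}(\infty) = -w_{2,i}(\infty)$. Assuming $w_{1,i}(\infty)$ and $w_{2,i}(\infty)$ to be jointly Gaussian
(having mean zero in the steady state, as $w_{opt,i} = 0$), using Price's theorem we can write $E[\mathrm{sgn}(w_{2,i}(\infty))\,\tilde{w}_{2,i}(\infty)] =
-E[\mathrm{sgn}(w_{2,i}(\infty))\,w_{2,i}(\infty)] = -g\,E[w_{2,i}(\infty)\,w_{2,i}(\infty)] = -g\,E[\tilde{w}_{2,i}(\infty)\,\tilde{w}_{2,i}(\infty)]$, where $g = \sqrt{\frac{2}{\pi}}\frac{1}{\sigma_{w_{2,i}}}$.

With these results, and noting that for white input $E[\tilde{w}_{2,i}^2(\infty)] = \lambda_{2,i}(\infty) = \sigma_{w_{2,i}}^2$, for $i \in Z$ the equation (32) can be written as

$$\lambda_{2,i}(\infty) = \frac{\mu}{2-\mu}\,\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right] - \frac{2\rho(1 - \mu P_i)}{\mu(2-\mu)P_i}\sqrt{\frac{2}{\pi}}\sqrt{\lambda_{2,i}(\infty)} + \frac{\rho^2}{\mu(2-\mu)P_i} \tag{36}$$

The above equation is a quadratic in $\sqrt{\lambda_{2,i}(\infty)}$ of the form

$$a\,\lambda_{2,i}(\infty) + b\,\sqrt{\lambda_{2,i}(\infty)} + c = 0 \tag{37a}$$

where the coefficients are

$$a = 1, \qquad b = \sqrt{\frac{8}{\pi}}\,\frac{\rho(1 - \mu P_i)}{\mu(2-\mu)P_i}, \qquad c = -\left(\frac{\mu}{2-\mu}\,\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right] + \frac{\rho^2}{\mu(2-\mu)P_i}\right) \tag{37b}$$

Then, for $i \in Z$, we have (since $\sigma_{w_{2,i}}$ is positive, only the real, positive root has to be considered)

$$\sqrt{\lambda_{2,i}(\infty)} = \frac{1}{\mu(2-\mu)P_i}\left[-\sqrt{\frac{2}{\pi}}\,\rho(1 - \mu P_i) + \sqrt{\frac{2}{\pi}\rho^2(1 - \mu P_i)^2 + \left(\mu^2 P_i\,\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right] + \rho^2\right)\mu(2-\mu)P_i}\,\right] \tag{38}$$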

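Squaring both sides of the above equation, or equivalently rearranging (36), the steady state mean square deviation of a single inactive tap can be written as

$$\lambda_{2,i}(\infty) = \underbrace{\lambda_{1,i}(\infty)}_{\text{APA}} + \frac{\rho^2}{\mu(2-\mu)P_i} - \sqrt{\frac{8}{\pi}}\,\frac{\rho(1 - \mu P_i)}{\mu(2-\mu)P_i}\sqrt{\lambda_{2,i}(\infty)} \tag{39}$$

with $\sqrt{\lambda_{2,i}(\infty)}$ given by (38) and $\lambda_{1,i}(\infty) = \frac{\mu}{2-\mu}\,\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right]$.

From the above equation, it can be observed that to have $\lambda_{2,i}(\infty) < \lambda_{1,i}(\infty)$, the controller constant $\rho$ has to be chosen
very carefully. By using (35) and (39), the EMSE of ZA-APA can be calculated.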
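The per-tap steady state expressions (35), (38) and (39) are simple to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, the argument `q` stands for the product $\sigma_\epsilon^2 E[1/\|\mathbf{u}(n)\|^2]$, and the numerical values are assumptions chosen for the example.

```python
import math

def apa_msd(mu, q):
    """lambda_{1,i}(inf) of APA, with q = sigma_eps^2 * E[1/||u||^2]."""
    return mu * q / (2.0 - mu)

def zaapa_msd_active(mu, rho, P, q):
    """lambda_{2,i}(inf) for an active tap, following (35)."""
    return apa_msd(mu, q) + rho**2 * (2.0 - mu * P) / (mu**2 * (2.0 - mu) * P**2)

def zaapa_msd_inactive(mu, rho, P, q):
    """lambda_{2,i}(inf) for an inactive tap: square of the positive root (38)."""
    D = mu * (2.0 - mu) * P
    c1 = math.sqrt(2.0 / math.pi) * rho * (1.0 - mu * P)
    root = (-c1 + math.sqrt(c1**2 + (mu**2 * P * q + rho**2) * D)) / D
    return root**2

# Illustrative values (assumed, not from the paper)
mu, rho, P, q = 0.5, 1e-5, 0.1, 6.25e-6
lam1 = apa_msd(mu, q)
lam2_nz = zaapa_msd_active(mu, rho, P, q)
lam2_z = zaapa_msd_inactive(mu, rho, P, q)
```

For these values the zero attractor lowers the deviation of an inactive tap below the APA level and raises that of an active tap above it, which is the trade-off that makes the choice of $\rho$ delicate.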

1) $\rho$ Range: For a system with given length $L$ and projection order $M$, $J_{ex,2}(\infty) < J_{ex,1}(\infty)$ if and only if


p < - - — 

^ - TT C 


(40) 


where 


C={K{L-K){2- pA) (f (1 - + 2p(2 - m)A) ) + ( K^(2 - - p) ) 

- ( {L- k)^{2- p)p^l3f ) 

This bound shows how the admissible value of the regularization parameter $\rho$ depends on the projection order $M$.


C. Cross Excess Mean Square Error Analysis of ZA-APA and APA Algorithms 

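The cross EMSE of the proposed combination can be written in the following form,

$$J_{ex,12}(n) = E[\tilde{\mathbf{w}}_2^T(n)\,\mathbf{R}\,\tilde{\mathbf{w}}_1(n)] = \mathrm{Tr}(\mathbf{R}\,\mathbf{K}_{12}(n)) = \mathrm{Tr}(\boldsymbol{\Lambda}\mathbf{V}^T\mathbf{K}_{12}(n)\mathbf{V}) \tag{41}$$

where $E[\tilde{\mathbf{w}}_1(n)\tilde{\mathbf{w}}_2^T(n)] = \mathbf{K}_{12}(n)$.

We define the diagonal elements of the transformed covariance matrix $\mathbf{V}^T\mathbf{K}_{12}(n)\mathbf{V}$ as $\lambda_{12,i}(n)$ for $i = 1, 2, \ldots, L$. That is

$$[\mathbf{V}^T\mathbf{K}_{12}(n)\mathbf{V}]_{ii} = \mathbf{v}_i^T\mathbf{K}_{12}(n)\mathbf{v}_i = \lambda_{12,i}(n) \tag{42}$$

With the above notation, the cross EMSE of the combination can be written as follows:

$$J_{ex,12}(n) = \sum_{i=1}^{L}\lambda_i\,\lambda_{12,i}(n) \tag{43}$$

Post-multiplying $\tilde{\mathbf{w}}_1(n+1)$ by $\tilde{\mathbf{w}}_2^T(n+1)$, taking expectation, and using assumptions A2) and A3) we get

$$\begin{aligned} \mathbf{K}_{12}(n+1) ={}& \mathbf{K}_{12}(n) - \mu E\left[\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right]\mathbf{K}_{12}(n) - \mu\,\mathbf{K}_{12}(n)\,E\left[\sum_{m\in K_n}\mathbf{v}_m\mathbf{v}_m^T\right] \\ &+ \mu^2 E\left[\left(\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right)\mathbf{K}_{12}(n)\left(\sum_{m\in K_n}\mathbf{v}_m\mathbf{v}_m^T\right)\right] + \mu^2\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right]E\left[\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right] \\ &+ \rho\,E\left[\mathbf{I}_L - \mu\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right]E\left[\tilde{\mathbf{w}}_1(n)\,\mathrm{sgn}(\mathbf{w}_2^T(n))\right] \end{aligned} \tag{44}$$

With the above notation, pre- and post-multiplication of (44) by $\mathbf{v}_i^T$ and $\mathbf{v}_i$ respectively results in

$$\begin{aligned} \lambda_{12,i}(n+1) ={}& \lambda_{12,i}(n) - \mu E\left[\mathbf{v}_i^T\left(\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right)\right]\mathbf{K}_{12}(n)\mathbf{v}_i - \mu\,\mathbf{v}_i^T\mathbf{K}_{12}(n)\,E\left[\left(\sum_{m\in K_n}\mathbf{v}_m\mathbf{v}_m^T\right)\mathbf{v}_i\right] \\ &+ \mu^2 E\left[\mathbf{v}_i^T\left(\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right)\mathbf{K}_{12}(n)\left(\sum_{m\in K_n}\mathbf{v}_m\mathbf{v}_m^T\right)\mathbf{v}_i\right] + \mu^2\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right]E\left[\mathbf{v}_i^T\left(\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right)\mathbf{v}_i\right] \\ &+ \rho\,E\left[\mathbf{v}_i^T\left(\mathbf{I}_L - \mu\sum_{k\in K_n}\mathbf{v}_k\mathbf{v}_k^T\right)\right]E\left[\tilde{\mathbf{w}}_1(n)\,\mathrm{sgn}(\mathbf{w}_2^T(n))\right]\mathbf{v}_i \end{aligned} \tag{45}$$

From the orthonormality of the $\mathbf{v}_i$'s and substituting $\Pr(i \in K_n) = P_i$, Eq. (45) can be written as

$$\lambda_{12,i}(n+1) = \underbrace{\lambda_{12,i}(n)[1 - \mu(2-\mu)P_i] + \mu^2\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right]P_i}_{\text{same form as } \lambda_{1,i}(n+1)} + \rho[1 - \mu P_i]\,E\left[\mathbf{v}_i^T\left(\tilde{\mathbf{w}}_1(n)\,\mathrm{sgn}(\mathbf{w}_2^T(n))\right)\mathbf{v}_i\right] \tag{46}$$

In steady state, i.e., as $n \to \infty$, the term $\lambda_{12,i}(n)$ becomes time invariant. Therefore, we have

$$\lambda_{12,i}(\infty) = \underbrace{\frac{\mu}{2-\mu}\,\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right]}_{\lambda_{1,i}(\infty)} + \frac{\rho(1 - \mu P_i)}{\mu(2-\mu)P_i}\,E\left[\mathbf{v}_i^T\left(\tilde{\mathbf{w}}_1(\infty)\,\mathrm{sgn}(\mathbf{w}_2^T(\infty))\right)\mathbf{v}_i\right] \tag{47}$$

Using the same approximations used in the EMSE analysis of ZA-APA, we can solve for the term $E\left[\mathbf{v}_i^T\left(\tilde{\mathbf{w}}_1(\infty)\,\mathrm{sgn}(\mathbf{w}_2^T(\infty))\right)\mathbf{v}_i\right]$ in
the above equation for white and correlated input. However, to derive closed form expressions for both active and inactive
coefficients, we assume that the algorithms are operating on white input.

Using the fact that $\lim_{n\to\infty}E[\tilde{w}_{1,i}(n)] = 0$, for every $i \in NZ$ one can write

$$\frac{\rho(1 - \mu P_i)}{\mu(2-\mu)P_i}\,E\left[\tilde{w}_{1,i}(\infty)\,\mathrm{sgn}(w_{2,i}(\infty))\right] = \frac{\rho(1 - \mu P_i)}{\mu(2-\mu)P_i}\,E\left[\tilde{w}_{1,i}(\infty)\,\mathrm{sgn}(w_{opt,i})\right] = 0 \tag{48}$$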
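and, using Price's theorem, for every $i \in Z$ one can write

$$\frac{\rho(1 - \mu P_i)}{\mu(2-\mu)P_i}\,E\left[\tilde{w}_{1,i}(\infty)\,\mathrm{sgn}(w_{2,i}(\infty))\right] = -\frac{\rho(1 - \mu P_i)}{\mu(2-\mu)P_i}\sqrt{\frac{2}{\pi\sigma_{w_{2,i}}^2}}\,\lambda_{12,i}(\infty) = -\frac{\rho(1 - \mu P_i)}{\mu(2-\mu)P_i}\sqrt{\frac{2}{\pi}}\,\frac{\lambda_{12,i}(\infty)}{\sqrt{\lambda_{2,i}(\infty)}} \tag{49}$$

From (47), for every $i \in NZ$ the cross term can be written as

$$\lambda_{12,i}(\infty) = \frac{\mu}{2-\mu}\,\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right] = \lambda_{1,i}(\infty) \tag{50}$$

and for $i \in Z$,

$$[\mu(2-\mu)P_i]\,\lambda_{12,i}(\infty) = \mu^2 P_i\,\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right] - \rho(1 - \mu P_i)\sqrt{\frac{2}{\pi}}\,\frac{\lambda_{12,i}(\infty)}{\sqrt{\lambda_{2,i}(\infty)}} \tag{51}$$

By rearranging the terms in the above equation, we get

$$\lambda_{12,i}(\infty) = \frac{\mu^2 P_i\,\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right]}{[\mu(2-\mu)P_i] + \dfrac{\rho(1 - \mu P_i)\sqrt{2/\pi}}{\sqrt{\lambda_{2,i}(\infty)}}} \tag{52}$$

From (38) and (52),

$$\begin{aligned} \lambda_{12,i}(\infty) &= \frac{\mu^2 P_i\,\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right]\sqrt{\lambda_{2,i}(\infty)}}{\mu(2-\mu)P_i\sqrt{\lambda_{2,i}(\infty)} + \rho(1 - \mu P_i)\sqrt{2/\pi}} \\ &= \lambda_{1,i}(\infty) - \lambda_{1,i}(\infty)\,\frac{\rho(1 - \mu P_i)\sqrt{2/\pi}}{\sqrt{\frac{2}{\pi}\rho^2(1 - \mu P_i)^2 + \left(\mu^2 P_i\,\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right] + \rho^2\right)\mu(2-\mu)P_i}} \\ &= \lambda_{1,i}(\infty) - \lambda_{1,i}(\infty)\left[1 + \frac{\pi\left(\mu^3 P_i^2(2-\mu)\,\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right] + \rho^2\mu(2-\mu)P_i\right)}{2\rho^2(1 - \mu P_i)^2}\right]^{-1/2} \end{aligned} \tag{53}$$

By using (50) and (53), the cross EMSE of the proposed combination can be calculated.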
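The closed form for the cross term of an inactive tap can be cross-checked against the implicit relation (51). The sketch below is illustrative; the function names are hypothetical, `q` denotes $\sigma_\epsilon^2 E[1/\|\mathbf{u}(n)\|^2]$, and the parameter values are assumptions.

```python
import math

def sqrt_lam2_inactive(mu, rho, P, q):
    """Positive root sqrt(lambda_{2,i}(inf)) for an inactive tap, as in (38)."""
    D = mu * (2.0 - mu) * P
    c1 = math.sqrt(2.0 / math.pi) * rho * (1.0 - mu * P)
    return (-c1 + math.sqrt(c1**2 + (mu**2 * P * q + rho**2) * D)) / D

def cross_msd_inactive(mu, rho, P, q):
    """lambda_{12,i}(inf) for an inactive tap, closed form (53)."""
    lam1 = mu * q / (2.0 - mu)
    D = mu * (2.0 - mu) * P
    c1 = math.sqrt(2.0 / math.pi) * rho * (1.0 - mu * P)
    S = c1**2 + (mu**2 * P * q + rho**2) * D
    return lam1 - lam1 * c1 / math.sqrt(S)

# Illustrative values (assumed, not from the paper)
mu, rho, P, q = 0.5, 1e-5, 0.1, 6.25e-6
lam1 = mu * q / (2.0 - mu)
lam12 = cross_msd_inactive(mu, rho, P, q)
```

As long as $\rho > 0$ and $\mu P_i < 1$, the subtracted term is positive, so the cross term of an inactive tap lies strictly below $\lambda_{1,i}(\infty)$.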
IV. Convergence Analysis of a(n) for Various Cases of Sparsity Level of the System
1) Non-Sparse Systems: For non-sparse systems, the set Z (set of zero taps) contains a very small number of coefficients.
Therefore, from (35) and (50) we have $\lambda_{1,i}(\infty) \approx \lambda_{12,i}(\infty)$ and $\lambda_{1,i}(\infty) < \lambda_{2,i}(\infty)$, or equivalently, $J_{ex,1}(\infty) \approx J_{ex,12}(\infty)$
and $J_{ex,1}(\infty) < J_{ex,2}(\infty)$. These imply $\Delta J_1 \approx 0$ and $\Delta J_2 = J_{ex,2}(\infty) - J_{ex,12}(\infty) \approx J_{ex,2}(\infty) - J_{ex,1}(\infty) > 0$. Therefore,
Eq. (13) leads to

$$E[a(n+1)] = E[a(n)] + \mu_a E\left[\lambda(n)\,(1 - \lambda(n))^2\right]\Delta J_2 \tag{54}$$

Note that $\forall\lambda(n) \in [0, 1]$, the function $g(\lambda(n)) = \lambda(n)(1 - \lambda(n))^2$ appearing in (54) satisfies $g(\lambda(n)) \ge 0$, with a maximum at $\lambda(n) = \frac{1}{3}$ and with $g(\lambda(n)) = 0$ at
$\lambda(n) = 0, 1$. The same holds for its mirror image $f(\lambda(n)) = \lambda^2(n)(1 - \lambda(n)) = g(1 - \lambda(n))$, whose maximum is at $\lambda(n) = \frac{2}{3}$. Assume that at the $n$-th index, $-a^+ < E[a(n)] < a^+$, meaning that $a(n)$ has not converged to $a^+$ or $-a^+$ (in all
trials), but is taking values from $[-a^+, a^+]$, or equivalently, $\lambda(n)$ has not converged to $\lambda^+$ or $1 - \lambda^+$ but is assuming values from
$[(1 - \lambda^+), \lambda^+]$. For $1 - \lambda^+ \le \lambda(n) \le \lambda^+$, $g(\lambda(n)) \ge g(\lambda^+) = f(1 - \lambda^+) = C$. Substituting in (54), $E[a(n+1)] \ge E[a(n)] + \mu_a C\,\Delta J_2$.
Since $\Delta J_2 > 0$, the above implies $\lim_{n\to\infty}E[a(n)] = a^+$ and thus $\lim_{n\to\infty}a(n) = a^+$ almost surely. This results in $\lim_{n\to\infty}\lambda(n) = \lambda^+ \approx 1$
(almost surely), and therefore the proposed combination switches to the APA algorithm (Filter 1), which performs better than the
ZA-APA algorithm (Filter 2) (i.e., $J_{ex,1}(\infty) < J_{ex,2}(\infty)$).

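2) Semi-Sparse Systems: In semi-sparse systems, the zero tap set Z contains a significant number of elements compared
to a non-sparse system. For the given controller constant $\rho$ and $0 < \mu < 2$, the term

$$\frac{\rho(1 - \mu P_i)\sqrt{2/\pi}}{\sqrt{\frac{2}{\pi}\rho^2(1 - \mu P_i)^2 + \left(\mu^2 P_i\,\sigma_\epsilon^2\,E\left[\frac{1}{\|\mathbf{u}(n)\|^2}\right] + \rho^2\right)\mu(2-\mu)P_i}}$$

in equation (53) becomes positive. Therefore, for every $i \in Z$, we have $\lambda_{12,i}(\infty) < \lambda_{1,i}(\infty)$, i.e., $J_{ex,1}(\infty) - J_{ex,12}(\infty) > 0$. Also, due to the
presence of a large number of non-zero coefficients, APA still performs better than the ZA-APA algorithm, i.e., $J_{ex,2}(\infty) > J_{ex,1}(\infty)$.
So, we have $J_{ex,2}(\infty) > J_{ex,1}(\infty) > J_{ex,12}(\infty)$ and thus both $\Delta J_1 > 0$ and $\Delta J_2 > 0$. This is analogous to case (3),
section III of [18]. Under this, a stationary point is obtained by setting the update term in (13) to zero as $n \to \infty$, leading
to $E[\lambda(\infty)(1 - \lambda(\infty))^2]\Delta J_2 = E[\lambda^2(\infty)(1 - \lambda(\infty))]\Delta J_1$. Assuming a negligibly small variance for $\lambda(\infty)$, i.e., assuming
that the variance of $\lambda(\infty)$ tends to zero, which implies that $\lambda(\infty) \to$ a constant (almost surely) as $n \to \infty$, one can then obtain from the above
$(1 - E[\lambda(\infty)])\Delta J_2 = E[\lambda(\infty)]\Delta J_1$, or equivalently,

$$E[\lambda(\infty)] = \left[\frac{\Delta J_2}{\Delta J_1 + \Delta J_2}\right]_{1-\lambda^+}^{\lambda^+} \tag{55}$$

where $[\cdot]_{1-\lambda^+}^{\lambda^+}$ denotes truncation to the interval $[1 - \lambda^+, \lambda^+]$. It follows that

$$\begin{aligned} \lambda^+ \ge \lambda(\infty) > 0.5, & \quad \text{if } J_{ex,1}(\infty) < J_{ex,2}(\infty) \\ 0.5 > \lambda(\infty) \ge 1 - \lambda^+, & \quad \text{if } J_{ex,1}(\infty) > J_{ex,2}(\infty) \end{aligned} \tag{56}$$

As proved in [18], this case is not sub-optimal. Rather it leads to

$$J_{ex}(\infty) < \min[J_{ex,1}(\infty), J_{ex,2}(\infty)], \tag{57}$$

which means that in this case, the proposed convex combination works even better than each of its component filters.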
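The behaviour predicted by (13), (54) and (55) can be illustrated by iterating a deterministic version of the mean recursion, in which the expectations of functions of $\lambda(n)$ are replaced by the functions of a single deterministic $\lambda(n)$. That replacement, the function name `mean_mixing_limit` and the default parameters are assumptions of this sketch.

```python
import math

def sgm(a):
    """Sigmoid relating the auxiliary variable a to the mixing parameter."""
    return 1.0 / (1.0 + math.exp(-a))

def mean_mixing_limit(dJ1, dJ2, mu_a=1.0, a_plus=4.0, n_iter=20000):
    """Iterate a <- a + mu_a * [lam(1-lam)^2 dJ2 - lam^2 (1-lam) dJ1], clipped."""
    a = 0.0
    for _ in range(n_iter):
        lam = sgm(a)
        a += mu_a * (lam * (1.0 - lam) ** 2 * dJ2 - lam**2 * (1.0 - lam) * dJ1)
        a = max(-a_plus, min(a_plus, a))   # keep a inside [-a_plus, a_plus]
    return sgm(a)

# Three illustrative regimes (values assumed for demonstration)
lam_semi = mean_mixing_limit(1.0, 3.0)      # dJ1 > 0, dJ2 > 0
lam_nonsparse = mean_mixing_limit(0.0, 1.0) # dJ1 ~ 0, dJ2 > 0
lam_sparse = mean_mixing_limit(1.0, -1.0)   # dJ1 > 0, dJ2 < 0
```

With both differences positive the iteration settles at $\Delta J_2/(\Delta J_1 + \Delta J_2)$, while the other two regimes saturate at $\lambda^+$ and $1 - \lambda^+$ respectively.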
3) Sparse Systems: For sparse systems, Z (the set of inactive taps) contains the majority of the coefficients and NZ (the set of
active taps) contains a negligible number of elements. From (39), we have $\lambda_{2,i}(\infty) < \lambda_{1,i}(\infty)$, or equivalently, $J_{ex,2}(\infty) <
J_{ex,1}(\infty)$, i.e., the ZA-APA algorithm in this case outperforms the APA algorithm. Depending on the value of the controller
constant $\rho$, both $J_{ex,2}(\infty) < J_{ex,12}(\infty)$ and $J_{ex,2}(\infty) > J_{ex,12}(\infty)$ are possible. From (39) and (53), we have


Jex^2 


^ “ m( 2 - ft)/3i + 1(1 - 


(58) 


Case I is the consequence of the above situation as we discuss in the following paragraph. 

Case I: $J_{ex,2}(\infty) < J_{ex,12}(\infty) < J_{ex,1}(\infty)$, which implies $\Delta J_2 < 0$ and $\Delta J_1 > 0$.
This is analogous to case (1), Section III of [18]. The mean update recursion for $a(n)$ in this case leads to

$$E[a(n+1)] = E[a(n)] + \mu_a E[g(\lambda(n))]\,\Delta J_2 - \mu_a E[f(\lambda(n))]\,\Delta J_1, \qquad (59)$$

where $g(\lambda(n)) = \lambda(n)[1 - \lambda(n)]^2 = f(1 - \lambda(n))$. Like $f(\lambda(n))$, $g(\lambda(n)) \ge g(\lambda^+) = f(1 - \lambda^+) = C$, $\forall \lambda(n) \in [1 - \lambda^+, \lambda^+]$. From the arguments used for the non-sparse case above, we have $E[a(n+1)] \le E[a(n)] + \mu_a C(\Delta J_2 - \Delta J_1)$ with $\Delta J_2 - \Delta J_1 < 0$, which means $\lim_{n \to \infty} E[a(n)] = -a^+$ and thus $\lim_{n \to \infty} a(n) = -a^+$ (almost surely), or equivalently, $\lim_{n \to \infty} E[\lambda(n)] = 1 - \lambda^+ \approx 0$ (almost surely). The combination filter in this case converges to the ZA-APA based filter, which is the better of the two filters for a sparse system.
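
The drift of $a(n)$ towards $-a^+$ in Case I can be illustrated by iterating the mean recursion (59). The sketch below is an illustration under stated assumptions rather than the analysis itself: it replaces $E[f(\lambda(n))]$ and $E[g(\lambda(n))]$ by $f$ and $g$ evaluated at the current mean (the small-variance approximation), takes $\lambda = 1/(1+e^{-a})$ with $a$ truncated to $[-a^+, a^+]$ as in [18], and uses hypothetical values for `mu_a`, `a_plus` and the EMSE differences.

```python
import math


def sigmoid(a):
    """Mixing weight lambda as a function of the auxiliary variable a."""
    return 1.0 / (1.0 + math.exp(-a))


def mean_a_recursion(dj1, dj2, mu_a=100.0, a_plus=4.0, a0=0.0, n_iter=2000):
    """Iterate a <- a + mu_a * (g(lam) * dj2 - f(lam) * dj1), as in (59),
    with a truncated to [-a_plus, a_plus] after every step."""
    a = a0
    for _ in range(n_iter):
        lam = sigmoid(a)
        f = lam ** 2 * (1.0 - lam)      # f(lambda) = lambda^2 (1 - lambda)
        g = lam * (1.0 - lam) ** 2      # g(lambda) = f(1 - lambda)
        a += mu_a * (g * dj2 - f * dj1)
        a = max(-a_plus, min(a_plus, a))
    return a, sigmoid(a)


# Case I (hypothetical values): Delta J_1 > 0 and Delta J_2 < 0.
a_inf, lam_inf = mean_a_recursion(dj1=0.01, dj2=-0.005)
print(a_inf, lam_inf)
```

Both terms of the update are negative in Case I, so the iterate decreases until it is held at $-a^+$, and the mixing weight ends at $1 - \lambda^+$. Calling the same function with both differences positive gives an interior fixed point at $\Delta J_2 / (\Delta J_1 + \Delta J_2)$ instead.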

For higher values of $\rho$, the cross EMSE eventually becomes less than the EMSE of ZA-APA, which gives Case II.
Case II: $J_{ex,12}(\infty) < J_{ex,2}(\infty) < J_{ex,1}(\infty)$, meaning $\Delta J_1 > 0$ and $\Delta J_2 > 0$.
This is again analogous to case (3), Section III of [18]. Using the arguments used for the semi-sparse case above, (55) and thus (57) will be satisfied in this case, meaning the proposed combined filter will perform better than both filter 1 and filter 2.


V. A New Approach to Increase the Convergence Rate of the Combination in Sparse System Case

In the sparse system case, ZA-APA provides better steady-state EMSE performance but does not improve the convergence rate. Proportionate-type concepts can be incorporated into the ZA-APA algorithm to attain both an improved convergence rate and improved steady-state EMSE performance. Therefore, we use the Zero-Attracting Proportionate Affine Projection Algorithm (ZA-PAPA) as filter 2 in the proposed combination. The weight update equation of the ZA-PAPA algorithm is as follows,

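$$\mathbf{w}_2(n+1) = \mathbf{w}_2(n) + \mu\, \mathbf{G}(n) \mathbf{U}(n) \left( \epsilon \mathbf{I}_M + \mathbf{U}^T(n) \mathbf{G}(n) \mathbf{U}(n) \right)^{-1} \mathbf{e}_2(n) - \rho\, \mathrm{sgn}(\mathbf{w}_2(n)) \qquad (60)$$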

where G(n) is a diagonal gain matrix that distributes the adaptation energy unevenly over the filter taps by modifying the step 
size of each tap. 
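
One iteration of the ZA-PAPA recursion (60) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that the columns of $\mathbf{U}(n)$ are the $M$ most recent input regressors, that the error vector is $\mathbf{e}_2(n) = \mathbf{d}(n) - \mathbf{U}^T(n)\mathbf{w}_2(n)$, and that the regularization term is $\epsilon \mathbf{I}_M$. The function name and the default values of `mu`, `rho` and `eps` are hypothetical.

```python
import numpy as np


def za_papa_update(w, U, d, gains, mu=0.5, rho=1e-5, eps=1e-3):
    """One ZA-PAPA iteration in the form of (60).

    w     : (L,)   current weight vector w_2(n)
    U     : (L, M) input data matrix, one regressor per column (assumed layout)
    d     : (M,)   desired-response samples matching the columns of U
    gains : (L,)   diagonal entries of the gain matrix G(n)
    """
    e = d - U.T @ w                            # error vector e_2(n)
    GU = gains[:, None] * U                    # G(n) U(n), using that G(n) is diagonal
    A = eps * np.eye(U.shape[1]) + U.T @ GU    # eps * I_M + U^T(n) G(n) U(n)
    w_new = w + mu * GU @ np.linalg.solve(A, e) - rho * np.sign(w)
    return w_new, e


# Hypothetical usage with L = 16 taps and projection order M = 4.
rng = np.random.default_rng(0)
U = rng.standard_normal((16, 4))
w = rng.standard_normal(16)
d = rng.standard_normal(4)
w_next, err = za_papa_update(w, U, d, gains=np.ones(16))
```

Setting all gains to one and `rho` to zero reduces the step to a regularized APA update, which gives a quick sanity check on the sketch.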

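The gain matrix $\mathbf{G}(n)$ is evaluated as

$$\mathbf{G}(n) = \mathrm{diag}(g_0(n), g_1(n), \ldots, g_{L-1}(n)) \qquad (61)$$

where

$$g_l(n) = \frac{\gamma_l(n)}{\frac{1}{L} \sum_{i=0}^{L-1} \gamma_i(n)}, \quad 0 \le l \le L-1 \qquad (62)$$

with

$$\gamma_l(n) = \max\left[ \rho\, \gamma_{\min}(n),\; F[|w_l(n)|] \right],$$
$$\gamma_{\min}(n) = \max\left( \delta,\; F[|w_0(n)|],\; \ldots,\; F[|w_{L-1}(n)|] \right) \qquad (63)$$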


where $\rho$ is a very small positive constant which, together with $\gamma_{\min}(n)$, ensures that $\gamma_l(n)$ and thus $g_l(n)$ do not become zero for the inactive taps, so that the corresponding update does not stall. The parameter $\delta$ is again a small positive constant, employed to avoid stalling of the weight update at the start of the iterations, when the tap weights are initialized to zero.
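
The gain computation in (61) to (63) can be sketched as below. This is a hedged illustration: it assumes that the gains are normalized by the mean of the $\gamma$ values (so that they average to one), and that $F$ is the identity, i.e. $F[|w|] = |w|$; the paper leaves $F$ general. The function name and the default values of `rho` and `delta` are hypothetical.

```python
def papa_gains(w, rho=0.01, delta=0.01, F=lambda x: x):
    """Diagonal entries g_l(n) of G(n) for a weight vector w (list of floats)."""
    L = len(w)
    f_vals = [F(abs(w_l)) for w_l in w]
    # gamma_min(n) = max(delta, F[|w_0|], ..., F[|w_{L-1}|])
    gamma_min = max([delta] + f_vals)
    # gamma_l(n) = max(rho * gamma_min(n), F[|w_l|]); never zero, so no tap stalls
    gamma = [max(rho * gamma_min, f_l) for f_l in f_vals]
    mean_gamma = sum(gamma) / L
    return [g / mean_gamma for g in gamma]


# All-zero initialization: delta keeps every gain strictly positive.
print(papa_gains([0.0, 0.0, 0.0, 0.0]))
# One active tap: it receives most of the adaptation energy.
print(papa_gains([0.0, 0.0, 1.0, 0.0]))
```

With all taps at zero every gain equals one, so the first update behaves like a plain APA step. Once a tap grows, it draws a larger share of the step size while the inactive taps keep a small but non-zero gain.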

A full understanding of the APA and ZA-PAPA combination requires a steady-state EMSE analysis for $a(n)$ or $\lambda(n)$, which is beyond the scope of this correspondence. Detailed simulations of the convex combination of ZA-PAPA and APA are presented below.


VI. Simulation Studies and Discussion 

Here the analytical results presented in Section IV are compared with simulations for a system identification example. The proposed algorithm has been simulated for identifying a system ($\mathbf{w}_{opt}$) of length L = 256 taps. Initially, the system is taken to be non-sparse, with all coefficients (randomly chosen) having significant magnitude. After 6000 time samples, the system is changed to a semi-sparse system with 80 active taps and the remaining coefficients equal to zero. Finally, after 12000 time samples, the system is changed to a highly sparse system having 16 active taps, with the remaining coefficients being inactive.

Simulations were performed using zero-mean Gaussian white noise and a first-order autoregressive (AR(1)) process having a pole at 0.8, with unit variance ($\sigma^2 = 1$), as input signals. The observation noise e(n) was taken to be zero-mean Gaussian white noise with a variance of 0.001. The projection order (M) was taken to be 8 for both the APA and the ZA-APA algorithms, and the initial
taps were chosen to be zero. The controller constant p was taken to be 8 x 10“® and 3 x 10“® for white and colored input 
cases, respectively. The simulation results shown in Fig. 2 and Fig. 3 are obtained by plotting the EMSE against the iteration index n, averaged over 1000 experiments.
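
The test signals described above can be generated along the following lines. This is a sketch under assumptions that the text does not fix: active taps are placed at random positions with magnitudes drawn from [0.5, 1] and random sign, the unit variance is taken to refer to the AR(1) process itself, and the total run length is set to 18000 samples. All names are hypothetical.

```python
import math
import random


def make_system(length, n_active, rng):
    """Impulse response with n_active non-zero taps at random positions."""
    w = [0.0] * length
    for idx in rng.sample(range(length), n_active):
        # Magnitude in [0.5, 1] with random sign keeps every active tap significant.
        w[idx] = rng.choice((-1.0, 1.0)) * rng.uniform(0.5, 1.0)
    return w


def ar1_input(n_samples, pole, rng):
    """Unit-variance AR(1) sequence u(n) = pole * u(n-1) + sqrt(1 - pole^2) * v(n)."""
    scale = math.sqrt(1.0 - pole ** 2)
    u, prev = [], rng.gauss(0.0, 1.0)
    for _ in range(n_samples):
        prev = pole * prev + scale * rng.gauss(0.0, 1.0)
        u.append(prev)
    return u


rng = random.Random(0)
L = 256
N = 18000  # assumed total length: three blocks of 6000 samples
# System in force on each block: non-sparse, semi-sparse, highly sparse.
systems = [make_system(L, n_active, rng) for n_active in (L, 80, 16)]
u_colored = ar1_input(N, 0.8, rng)
noise = [rng.gauss(0.0, math.sqrt(0.001)) for _ in range(N)]
```

Scaling the driving noise by $\sqrt{1 - 0.8^2}$ is what keeps the AR(1) output at unit variance; if the unit variance was instead meant for the driving noise, the scale factor would simply be dropped.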

Several interesting observations follow from Fig. 2 and Fig. 3. The plots confirm our conjecture that for the non-sparse system $J_{ex,1}(\infty) \approx J_{ex,12}(\infty)$, while for all other cases (i.e., sparse and semi-sparse systems), $J_{ex,1}(\infty) > J_{ex,12}(\infty)$. Secondly, for the chosen value of $\rho$, it is seen that $J_{ex,2}(\infty) < J_{ex,12}(\infty)$ when the system is highly sparse. As per our analysis above, the proposed combination in this case should converge to the ZA-APA based filter, meaning we should have $J_{ex}(\infty) \approx J_{ex,2}(\infty)$. Similarly, for the non-sparse system, it is seen from Fig. 2 and Fig. 3 that $J_{ex}(\infty) \approx J_{ex,1}(\infty)$, meaning the convex combination in this case favors filter 1, which confirms our arguments above. Lastly, for the semi-sparse system, the plots confirm that $J_{ex,2}(\infty) > J_{ex,1}(\infty) > J_{ex,12}(\infty)$. As per the discussion in the previous section, the proposed combination in this case is likely to produce a solution that performs better than both filter 1 and filter 2.

Figure 2. Learning curves of APA and ZA-APA combination for white input with M = 8



Figure 3. Learning curves of APA and ZA-APA combination for AR(1) input having pole a = 0.8 and M = 8 


Next, simulations were performed for the proposed convex combination of APA and ZA-PAPA. The learning curves of the proposed combination for white and colored input signals are presented in Fig. 4 and Fig. 5, respectively. From these figures, it can be observed that using proportionate concepts in ZA-APA provides a higher convergence rate in the sparse system case.


VII. Conclusions 

An algorithm that is robust against correlated input is presented for identifying sparse systems whose degree of sparsity varies with time and context. The proposed method uses an adaptive convex combination of the sparsity-aware ZA-APA algorithm and the standard APA algorithm. The algorithm adapts dynamically to the level of sparseness and exhibits robustness against correlated input.










































Figure 4. Learning curves of ZA-PAPA and APA combination for white input with M = 8

Figure 5. Learning curves of ZA-PAPA and APA combination for AR input with M = 8